A shared latent space matrix factorisation method for recommending new trial evidence for systematic review updates

Authors

  • Didi Surian
  • Adam G. Dunn
  • Liat Orenstein
  • Rabia Bashir
  • Enrico W. Coiera
  • Florence T. Bourgeois
Abstract

BACKGROUND Clinical trial registries can be used to monitor the production of trial evidence and signal when systematic reviews become out of date. However, this use has been limited to date because of the extensive manual work required to search for and screen relevant trial registrations. Our aim was to evaluate a new method that could partially automate the identification of trial registrations that may be relevant for systematic review updates.
MATERIALS AND METHODS We identified 179 systematic reviews of drug interventions for type 2 diabetes, which included 537 clinical trials that had registrations in ClinicalTrials.gov. Text from the trial registrations was used as features directly, or transformed using Latent Dirichlet Allocation (LDA) or Principal Component Analysis (PCA). We tested a novel matrix factorisation approach that uses a shared latent space to learn how to rank relevant trial registrations for each systematic review, and compared its performance with a baseline that ranks trial registrations by document similarity. The two approaches were tested on a holdout set of the newest trials from the set of type 2 diabetes systematic reviews and on an unseen set of 141 clinical trial registrations from 17 updated systematic reviews published in the Cochrane Database of Systematic Reviews. Performance was measured by the number of relevant registrations found after examining 100 candidates (recall@100) and the median rank of relevant registrations in the ranked candidate lists.
RESULTS The matrix factorisation approach outperformed the document similarity approach, with a median rank of 59 (of 128,392 candidate registrations in ClinicalTrials.gov) and recall@100 of 60.9% using the LDA feature representation, compared to a median rank of 138 and recall@100 of 42.8% for the document similarity baseline. In the second set of systematic reviews and their updates, the highest performing approach used document similarity and gave a median rank of 67 (recall@100 of 62.9%).
CONCLUSIONS A shared latent space matrix factorisation method was useful for ranking trial registrations to reduce the manual workload associated with finding relevant trials for systematic review updates. The results suggest that the approach could be used as part of a semi-automated pipeline for monitoring potentially new evidence for inclusion in a review update.
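
The two evaluation measures named in the abstract, and the document similarity baseline, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine similarity measure, the function names, the toy feature vectors, and the placeholder registration identifiers are all assumptions made for illustration, and recall@k is expressed here as the fraction of relevant registrations found in the top k.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(review_vec, candidates):
    """Return candidate IDs ordered from most to least similar to the review."""
    scored = sorted(
        candidates.items(),
        key=lambda item: cosine(review_vec, item[1]),
        reverse=True,
    )
    return [cid for cid, _ in scored]


def recall_at_k(ranking, relevant, k=100):
    """Fraction of relevant registrations that appear in the top k candidates."""
    if not relevant:
        return 0.0
    return len(set(ranking[:k]) & set(relevant)) / len(relevant)


def median_rank(ranking, relevant):
    """Median 1-based position of the relevant registrations in the ranking."""
    position = {cid: i + 1 for i, cid in enumerate(ranking)}
    return median(position[cid] for cid in relevant if cid in position)


# Hypothetical toy data: one review vector and four candidate registrations.
review = [1.0, 0.0, 1.0]
candidates = {
    "REG-A": [1.0, 0.0, 1.0],
    "REG-B": [0.0, 1.0, 0.0],
    "REG-C": [1.0, 1.0, 1.0],
    "REG-D": [1.0, 0.0, 0.0],
}
relevant = {"REG-A", "REG-D"}  # registrations assumed to be included in the review

ranking = rank_candidates(review, candidates)
print(ranking)
print(recall_at_k(ranking, relevant, k=2))
print(median_rank(ranking, relevant))
```

In practice the feature vectors would come from the registration text (directly, or after LDA or PCA as the abstract describes), and k would be 100 over the full candidate list rather than 2 over four toy entries.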

Similar resources

Automatizing the Assignment of the Submitted Manuscripts to Reviewers: A Systematic Review of Research Texts

Purpose: To systematically review the automation of the assignment of submitted manuscripts to reviewers in order to identify the status of research in this field in terms of types of evidence of expertise, types of retrieval models used, and research gaps, and finally to offer some suggestions for future research. Method: The current research followed the systema...

Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features

Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...

Approximate Analytic Matrix Factorisations as Preconditioners for Newton's Method

This paper treats direct methods of matrix factorisation as being approximated by analytic recursions. From this viewpoint, a new class of preconditioners is motivated, which is denoted by approximate analytic factorisation (AAF). Due to analyticity and exploitation of invariances, their computational and storage cost is usually rather cheap. AAF are derived for a variety of practically relevan...

Generalised Bayesian matrix factorisation models

Factor analysis and related models for probabilistic matrix factorisation are of central importance to the unsupervised analysis of data, with a colourful history more than a century long. Probabilistic models for matrix factorisation allow us to explore the underlying structure in data, and have relevance in a vast number of application areas including collaborative filtering, source separatio...

What is a "Systematic Review" and how is it written?

Abstract Background: Successful clinical decisions are the outcome of a complex process. In making them, we draw on information from scientific evidence, our personal experience and external rules and constraints. Considering the explosive increase in the amount and quality of the scientific evidence that has come from both the laboratory bench and the bedside, we may lack the time, mo...

Journal:
  • Journal of Biomedical Informatics

Volume 79

Publication date: 2018